Infrastructure

Introduction

This guide describes the errors that you can encounter with the infrastructure Dais runs on, and how to troubleshoot them.

NOTE: While Dais will report this category of error, these are not Dais errors; the root causes of these issues are infrastructure-related.

NOTE: When reporting an error, please remember to include:

  • The steps to reproduce the issue.
  • Timestamp (with timezone).
  • App Id and Region (if known).
  • Any other details that could assist with troubleshooting.

External Component Errors

File System Errors

Reported as FileSystemError with HTTP status code 403 or 503.

Troubleshooting Guide

The troubleshooting steps may differ depending on the CLOUD_BUCKET_STORAGE_PROVIDER used in the deployment, but the common steps are:

  • Ensure a valid service account key is set for the deployment:
    • azure-storage-account-connection-string.key
    • gcp-storage.json
  • Ensure the service account (SA) keys are mounted or passed to the services:
    • Azure: the 'secretRef' should reference the Kubernetes secret azure-storage-creds.
    • GCP: the SA key should be mounted as a volume from the secret cloud-storage-credentials.

For 503 errors, use the infrastructure logs to investigate further and identify the root cause.
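
As a first sanity check on the key files listed above, the following minimal Python sketch tests whether their contents are structurally plausible. It is not part of Dais. The function names are hypothetical, and the sketch assumes that gcp-storage.json is a standard GCP service account key and that the Azure file holds an account-key style connection string. It cannot tell whether the credentials are actually accepted by the storage provider.

```python
import json

# Fields present in a standard GCP service account key file.
REQUIRED_GCP_FIELDS = {"type", "project_id", "private_key", "client_email"}


def check_gcp_key(text: str) -> list[str]:
    """Return a list of problems found in a GCP key file (empty if none)."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]
    problems = [
        f"missing field: {name}"
        for name in sorted(REQUIRED_GCP_FIELDS - set(data))
    ]
    # A key of another credential type would not work as a service account key.
    if data.get("type") not in (None, "service_account"):
        problems.append(f"unexpected type: {data['type']!r}")
    return problems


def check_azure_connection_string(text: str) -> list[str]:
    """Return a list of problems in an account-key connection string."""
    # Split on the first '=' only, because base64 keys end with '=' padding.
    parts = dict(
        item.split("=", 1) for item in text.strip().split(";") if "=" in item
    )
    return [
        f"missing field: {name}"
        for name in ("AccountName", "AccountKey")
        if not parts.get(name)
    ]
```

To use it, read the file contents and pass them in, for example `check_gcp_key(open("gcp-storage.json").read())`. An empty list means no structural problem was found.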

Database Errors

Reported with HTTP status code 503.

Troubleshooting Guide

For database connection issues, check the following:

  • The database is running and accessible via other tools (e.g., psql).
  • The IP address and credentials of the database specified during setup are correct.
  • The port the database is available on matches the one used in the Dais deployment's values.yaml.
  • The firewall and network policies in use allow connections to the database from Kubernetes services.
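
The reachability checks above can be approximated with a plain TCP connection test, run from a pod or host on the same network as the Dais services. This is a minimal sketch using only the Python standard library, and `can_connect` is a hypothetical helper name. It confirms only that the port accepts connections. It does not validate credentials or the database itself.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, `can_connect("db.example.internal", 5432)` tests a placeholder host on the default PostgreSQL port. Substitute the address and port from your own values.yaml. If the check succeeds from a workstation but fails from inside the cluster, firewall rules or network policies are the likely cause.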

App and Docker Registry Errors

Reported as AppError or DockerRegistryError with HTTP status code 403 or 503. Most often encountered when building persistent-services or models.

Troubleshooting Guide

Depending on the PROJECT_SERVICES_BUILD_CACHE_DOCKER_REGISTRY_AUTH_TYPE set during deployment, check the following:

  • For GCP, the credentials are set in the gcp-build-cache-registry-credentials Kubernetes secret, based on the gcp-cache-registry-key.json deployment file.

  • For 'basic' authentication, the credentials are set in:

    • PROJECT_SERVICES_BUILD_CACHE_DOCKER_REGISTRY_USERNAME and PROJECT_SERVICES_BUILD_CACHE_DOCKER_REGISTRY_PASSWORD

    • PROJECT_SERVICES_BASE_DOCKER_ARTIFACTORY_REGISTRY_USERNAME and PROJECT_SERVICES_BASE_DOCKER_ARTIFACTORY_REGISTRY_PASSWORD
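
For the 'basic' case, a quick way to spot an unset credential is to list which of the four settings above are missing or empty. The sketch below assumes the settings are exposed as environment variables where the script runs. That may not match your deployment, in which case pass a mapping built from your own configuration. The helper name is hypothetical.

```python
import os
from collections.abc import Mapping

# Setting names taken from the list above.
BASIC_AUTH_VARS = (
    "PROJECT_SERVICES_BUILD_CACHE_DOCKER_REGISTRY_USERNAME",
    "PROJECT_SERVICES_BUILD_CACHE_DOCKER_REGISTRY_PASSWORD",
    "PROJECT_SERVICES_BASE_DOCKER_ARTIFACTORY_REGISTRY_USERNAME",
    "PROJECT_SERVICES_BASE_DOCKER_ARTIFACTORY_REGISTRY_PASSWORD",
)


def missing_basic_auth_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of basic-auth settings that are unset or blank."""
    return [name for name in BASIC_AUTH_VARS if not env.get(name, "").strip()]
```

Calling `missing_basic_auth_vars()` with no arguments inspects the current process environment. An empty result means all four values are present, not that the registry accepts them.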

Internal Dais Component Errors

Internal Dais component errors typically occur at the Core and App levels rather than the Infrastructure level. Troubleshooting guides that detail how to diagnose and resolve errors at these levels are currently in development.